Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Neural Information Processing Systems

Federated learning (FL) "involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized" [56] because of privacy concerns or limited communication resources. Hence, clients only communicate with a potentially far-away (e.g., in another continent) orchestrator and do not ... Recent experimental and theoretical work suggests that, in practice, the first effect has been over-estimated by classic worst-case convergence bounds.



opXRD: Open Experimental Powder X-ray Diffraction Database

Hollarek, Daniel, Schopmans, Henrik, Östreicher, Jona, Teufel, Jonas, Cao, Bin, Alwen, Adie, Schweidler, Simon, Singh, Mriganka, Kodalle, Tim, Hu, Hanlin, Heymans, Gregoire, Abdelsamie, Maged, Hardiagon, Arthur, Wieczorek, Alexander, Zhuk, Siarhei, Schwaiger, Ruth, Siol, Sebastian, Coudert, François-Xavier, Wolf, Moritz, Sutter-Fella, Carolin M., Breitung, Ben, Hodge, Andrea M., Zhang, Tong-yi, Friederich, Pascal

arXiv.org Artificial Intelligence

Powder X-ray diffraction (pXRD) experiments are a cornerstone for materials structure characterization. Despite their widespread application, analyzing pXRD diffractograms still presents a significant challenge to automation and a bottleneck in high-throughput discovery in self-driving labs. Machine learning promises to resolve this bottleneck by enabling automated powder diffraction analysis. A notable difficulty in applying machine learning to this domain is the lack of sufficiently sized experimental datasets, which has constrained researchers to train primarily on simulated data. However, models trained on simulated pXRD patterns showed limited generalization to experimental patterns, particularly for low-quality experimental patterns with high noise levels and elevated backgrounds. With the Open Experimental Powder X-Ray Diffraction Database (opXRD), we provide an openly available and easily accessible dataset of labeled and unlabeled experimental powder diffractograms. Labeled opXRD data can be used to evaluate the performance of models on experimental data and unlabeled opXRD data can help improve the performance of models on experimental data, e.g. through transfer learning methods. We collected 92552 diffractograms, 2179 of them labeled, from a wide spectrum of materials classes. We hope this ongoing effort can guide machine learning research toward fully automated analysis of pXRD data and thus enable future self-driving materials labs.


Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Neural Information Processing Systems

Federated learning usually employs a client-server architecture where an orchestrator iteratively aggregates model updates from remote clients and pushes a refined model back to them. This approach may be inefficient in cross-silo settings, as close-by data silos with high-speed access links may exchange information with each other faster than with the orchestrator, and the orchestrator may become a communication bottleneck. In this paper we define the problem of topology design for cross-silo federated learning, using the theory of max-plus linear systems to compute the system throughput, i.e., the number of communication rounds per time unit. We also propose practical algorithms that, given knowledge of measurable network characteristics, find a topology with the largest throughput or with provable throughput guarantees. Speedups are even larger with slower access links.
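As a rough illustration of how max-plus theory yields a throughput figure, the sketch below relies on a standard result: for a strongly connected communication graph, the asymptotic time per round of a max-plus linear system equals the maximum cycle mean of the delay graph, and the throughput is its reciprocal. The delay matrix, the function names, and the choice of Karp's algorithm are assumptions made for this example; they are not taken from the paper and do not represent its topology-design algorithms.

```python
import math

NEG_INF = -math.inf


def max_cycle_mean(delay):
    """Maximum cycle mean of a strongly connected delay graph (Karp's algorithm).

    delay[v][u] is the (hypothetical) time for silo u's update to reach and be
    processed at silo v, or -inf when there is no link u -> v.
    """
    n = len(delay)
    # D[k][v] = largest total delay of a walk with exactly k links from silo 0 to v.
    D = [[NEG_INF] * n for _ in range(n + 1)]
    D[0][0] = 0.0
    for k in range(1, n + 1):
        for v in range(n):
            for u in range(n):
                if D[k - 1][u] > NEG_INF and delay[v][u] > NEG_INF:
                    D[k][v] = max(D[k][v], D[k - 1][u] + delay[v][u])
    best = NEG_INF
    for v in range(n):
        if D[n][v] == NEG_INF:
            continue
        # Karp's formula for the maximum cycle mean: max over v of min over k.
        worst = min(
            (D[n][v] - D[k][v]) / (n - k) for k in range(n) if D[k][v] > NEG_INF
        )
        best = max(best, worst)
    return best


def throughput(delay):
    """Communication rounds per time unit = 1 / cycle time."""
    return 1.0 / max_cycle_mean(delay)


# Illustrative directed ring of three silos: 0 -> 1 (1 s), 1 -> 2 (2 s), 2 -> 0 (3 s).
ring = [
    [NEG_INF, NEG_INF, 3.0],
    [1.0, NEG_INF, NEG_INF],
    [NEG_INF, 2.0, NEG_INF],
]
cycle_time = max_cycle_mean(ring)  # 2.0 time units per round
rounds_per_second = throughput(ring)  # 0.5
```

Comparing this number across candidate overlays (for example, a ring versus a star centered on the orchestrator) is one way to see why the choice of topology affects how many rounds complete per time unit.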


Apple 'Foliar' Disease Detection Analysis 🍎🌳

#artificialintelligence

Analyze the Plant Pathology 2020 dataset to build a CNN-based multi-class classification deep learning model that can predict the most common diseases in apple tree leaves. It all starts with the planting of seeds. The seeds then develop into seedlings, which grow into adult apple trees. An adult apple tree grows flowers, and the flowers produce fruit with seeds.


Automated Inference System for End-To-End Diagnosis of Network Performance Issues in Client-Terminal Devices

Widanapathirana, Chathuranga, Şekercioǧlu, Y. Ahmet, Ivanovich, Milosh V., Fitzpatrick, Paul G., Li, Jonathan C.

arXiv.org Artificial Intelligence

Traditional methods for diagnosing network problems in Client-Terminal Devices (CTDs) tend to be labor-intensive and time-consuming, and they contribute to increased customer dissatisfaction. In this paper, we propose an automated solution to rapidly diagnose the root causes of network performance issues in CTDs. Based on a new intelligent inference technique, we create the Intelligent Automated Client Diagnostic (IACD) system, which relies only on the collection of Transmission Control Protocol (TCP) packet traces. Using soft-margin Support Vector Machine (SVM) classifiers, the system (i) distinguishes link problems from client problems and (ii) identifies characteristics unique to the specific fault to report the root cause. The modular design of the system enables support for new access link and fault types. Experimental evaluation demonstrated the capability of the IACD system to distinguish between faulty and healthy links and to diagnose the client faults with 98% accuracy. The system can perform fault diagnosis independent of the user's specific TCP implementation, enabling diagnosis of a diverse range of client devices.
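To make the soft-margin SVM idea concrete, here is a minimal sketch of a linear soft-margin classifier trained by subgradient descent on the hinge loss. It is not the IACD implementation: the two standardized features per trace, the toy data, the labels (+1 for a faulty link, -1 for a healthy one), and the training procedure are all assumptions chosen for illustration.

```python
def train_soft_margin_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Fit a linear soft-margin SVM by per-sample subgradient descent.

    Minimizes 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * (w . x_i + b)).
    """
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            # Gradient of the regularizer, spread evenly over the n samples.
            grad_w = [wj / n for wj in w]
            grad_b = 0.0
            if yi * score < 1.0:
                # Margin violated: the hinge term contributes -C * y_i * x_i.
                grad_w = [g - C * yi * xj for g, xj in zip(grad_w, xi)]
                grad_b = -C * yi
            w = [wj - lr * g for wj, g in zip(w, grad_w)]
            b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (faulty) or -1 (healthy) for one feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical standardized features extracted from TCP traces (made-up values).
X = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
y = [1, 1, 1, -1, -1, -1]

w, b = train_soft_margin_svm(X, y)
labels = [predict(w, b, x) for x in X]
```

The parameter `C` controls the soft margin: smaller values tolerate more margin violations in exchange for a wider margin, which matters when trace features from different fault classes overlap.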


Diagnosing client faults using SVM-based intelligent inference from TCP packet traces

Widanapathirana, Chathuranga, Sekercioglu, Y. Ahmet, Fitzpatrick, Paul G., Ivanovich, Milosh V., Li, Jonathan C.

arXiv.org Artificial Intelligence

In recent years, technological developments in computer networking have predominantly focused on improving connection media speeds and state-of-the-art applications. In tandem with user demand for high-speed delivery of information, tolerance for performance and connectivity issues has decreased. Due to the complexity and scale of modern communications networks that include a multitude of possible client devices, traditional "expert knowledge" or "rule-based" methods of performance and fault diagnosis are increasingly inefficient and infeasible. Analysis of packet traces, especially from the Transmission Control Protocol (TCP), is a sophisticated inference-based technique used to diagnose complicated network problems in specialized cases. TCP traces contain artifacts related to behavioral characteristics of network elements that a skilled investigator can use to infer the location and root cause of a network fault.